72 research outputs found

    Super-Fast Distributed Algorithms for Metric Facility Location

    Full text link
    This paper presents a distributed O(1)-approximation algorithm, with expected-O(loglogn)O(\log \log n) running time, in the CONGEST\mathcal{CONGEST} model for the metric facility location problem on a size-nn clique network. Though metric facility location has been considered by a number of researchers in low-diameter settings, this is the first sub-logarithmic-round algorithm for the problem that yields an O(1)-approximation in the setting of non-uniform facility opening costs. In order to obtain this result, our paper makes three main technical contributions. First, we show a new lower bound for metric facility location, extending the lower bound of B\u{a}doiu et al. (ICALP 2005) that applies only to the special case of uniform facility opening costs. Next, we demonstrate a reduction of the distributed metric facility location problem to the problem of computing an O(1)-ruling set of an appropriate spanning subgraph. Finally, we present a sub-logarithmic-round (in expectation) algorithm for computing a 2-ruling set in a spanning subgraph of a clique. Our algorithm accomplishes this by using a combination of randomized and deterministic sparsification.Comment: 15 pages, 2 figures. This is the full version of a paper that appeared in ICALP 201

    On the Analysis of a Label Propagation Algorithm for Community Detection

    Full text link
    This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call \textsc{Max-LPA}, both in terms of its convergence time and in terms of the "quality" of the communities detected. \textsc{Max-LPA} is an instance of a class of community detection algorithms called \textit{label propagation} algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature and in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of \er random graphs with clusters V1,V2,...,VkV_1, V_2,..., V_k where the probability pp, of an edge connecting nodes within a cluster ViV_i is higher than pp', the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on pp and pp' (p=Ω(1n1/4ϵ)p = \Omega(\frac{1}{n^{1/4-\epsilon}}) for any ϵ>0\epsilon > 0, p=O(p2)p' = O(p^2), where nn is the number of nodes), \textsc{Max-LPA} detects the clusters V1,V2,...,VnV_1, V_2,..., V_n in just two rounds. Based on this and on empirical results, we conjecture that \textsc{Max-LPA} can correctly and quickly identify communities on clustered \er graphs even when the clusters are much sparser, i.e., with p=clognnp = \frac{c\log n}{n} for some c>1c > 1.Comment: 17 pages. Submitted to ICDCN 201

    Analysis of the Worst Case Space Complexity of a PR Quadtree

    Get PDF
    We demonstrate that a resolution-r PR quadtree containing n points has, in the worst case, at most nodes. This captures the fact that as n tends towards 4r, the number of nodes in a PR quadtree quickly approaches O(n). This is a more precise estimation of the worst case space requirement of a PR quadtree than has been attempted before

    Super-Fast MST Algorithms in the Congested Clique Using o(m) Messages

    Get PDF
    In a sequence of recent results (PODC 2015 and PODC 2016), the running time of the fastest algorithm for the minimum spanning tree (MST) problem in the Congested Clique model was first improved to O(log(log(log(n)))) from O(log(log(n))) (Hegeman et al., PODC 2015) and then to O(log^*(n)) (Ghaffari and Parter, PODC 2016). All of these algorithms use Theta(n^2) messages independent of the number of edges in the input graph. This paper positively answers a question raised in Hegeman et al., and presents the first "super-fast" MST algorithm with o(m) message complexity for input graphs with m edges. Specifically, we present an algorithm running in O(log^*(n)) rounds, with message complexity ~O(sqrt{m * n}) and then build on this algorithm to derive a family of algorithms, containing for any epsilon, 0 < epsilon <= 1, an algorithm running in O(log^*(n)/epsilon) rounds, using ~O(n^{1 + epsilon}/epsilon) messages. Setting epsilon = log(log(n))/log(n) leads to the first sub-logarithmic round Congested Clique MST algorithm that uses only ~O(n) messages. Our primary tools in achieving these results are (i) a component-wise bound on the number of candidates for MST edges, extending the sampling lemma of Karger, Klein, and Tarjan (Karger, Klein, and Tarjan, JACM 1995) and (ii) Theta(log(n))-wise-independent linear graph sketches (Cormode and Firmani, Dist. Par. Databases, 2014) for generating MST candidate edges

    Sample-And-Gather: Fast Ruling Set Algorithms in the Low-Memory MPC Model

    Get PDF
    Motivated by recent progress on symmetry breaking problems such as maximal independent set (MIS) and maximal matching in the low-memory Massively Parallel Computation (MPC) model (e.g., Behnezhad et al. PODC 2019; Ghaffari-Uitto SODA 2019), we investigate the complexity of ruling set problems in this model. The MPC model has become very popular as a model for large-scale distributed computing and it comes with the constraint that the memory-per-machine is strongly sublinear in the input size. For graph problems, extremely fast MPC algorithms have been designed assuming ??(n) memory-per-machine, where n is the number of nodes in the graph (e.g., the O(log log n) MIS algorithm of Ghaffari et al., PODC 2018). However, it has proven much more difficult to design fast MPC algorithms for graph problems in the low-memory MPC model, where the memory-per-machine is restricted to being strongly sublinear in the number of nodes, i.e., O(n^?) for constant 0 < ? < 1. In this paper, we present an algorithm for the 2-ruling set problem, running in O?(log^{1/6} ?) rounds whp, in the low-memory MPC model. Here ? is the maximum degree of the graph. We then extend this result to ?-ruling sets for any integer ? > 1. Specifically, we show that a ?-ruling set can be computed in the low-memory MPC model with O(n^?) memory-per-machine in O?(? ? log^{1/(2^{?+1}-2)} ?) rounds, whp. From this it immediately follows that a ?-ruling set for ? = ?(log log log ?)-ruling set can be computed in in just O(? log log n) rounds whp. The above results assume a total memory of O?(m + n^{1+?}). We also present algorithms for ?-ruling sets in the low-memory MPC model assuming that the total memory over all machines is restricted to O?(m). For ? > 1, these algorithms are all substantially faster than the Ghaffari-Uitto O?(?{log ?})-round MIS algorithm in the low-memory MPC model. All our results follow from a Sample-and-Gather Simulation Theorem that shows how random-sampling-based Congest algorithms can be efficiently simulated in the low-memory MPC model. We expect this simulation theorem to be of independent interest beyond the ruling set algorithms derived here

    Large-Scale Distributed Algorithms for Facility Location with Outliers

    Get PDF
    This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are: - Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. - Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both Facility Location with outlier problems. As shown in our previous work (Berns et al., ICALP 2012, Bandyapadhyay et al., ICDCN 2018) Mettu-Plaxton style algorithms are more easily amenable to being implemented efficiently in distributed and large-scale models of computation

    The Complexity of Symmetry Breaking in Massive Graphs

    Get PDF
    The goal of this paper is to understand the complexity of symmetry breaking problems, specifically maximal independent set (MIS) and the closely related beta-ruling set problem, in two computational models suited for large-scale graph processing, namely the k-machine model and the graph streaming model. We present a number of results. For MIS in the k-machine model, we improve the O~(m/k^2 + Delta/k)-round upper bound of Klauck et al. (SODA 2015) by presenting an O~(m/k^2)-round algorithm. We also present an Omega~(n/k^2) round lower bound for MIS, the first lower bound for a symmetry breaking problem in the k-machine model. For beta-ruling sets, we use hierarchical sampling to obtain more efficient algorithms in the k-machine model and also in the graph streaming model. More specifically, we obtain a k-machine algorithm that runs in O~(beta n Delta^{1/beta}/k^2) rounds and, by using a similar hierarchical sampling technique, we obtain one-pass algorithms for both insertion-only and insertion-deletion streams that use O(beta * n^{1+1/2^{beta-1}}) space. The latter result establishes a clear separation between MIS, which is known to require Omega(n^2) space (Cormode et al., ICALP 2019), and beta-ruling sets, even for beta = 2. Finally, we present an even faster 2-ruling set algorithm in the k-machine model, one that runs in O~(n/k^{2-epsilon} + k^{1-epsilon}) rounds for any epsilon, 0 <=epsilon <=1. For a wide range of values of k this round complexity simplifies to O~(n/k^2) rounds, which we conjecture is optimal. Our results use a variety of techniques. For our upper bounds, we prove and use simulation theorems for beeping algorithms, hierarchical sampling, and L_0-sampling, whereas for our lower bounds we use information-theoretic arguments and reductions to 2-party communication complexity problems

    New Results for the Minimum Weight Triangulation Problem

    Get PDF
    The current best polynomial time approximation algorithm produces a triangulation that can be O(log n) times the weight of the optimal triangulation. We propose an algorithm that triangulates a set P of n points in a plane in O(n3) time and that never does worse than the greedy triangulation. We investigate issues of local optimality pertaining to known triangulation algorithms and suggest an interesting new approach to studying triangulation algorithms. We restate the minimum weight triangulation problem as a graph problem and show the NP-hardness of a closely related graph problem. Finally, we show that the constrained problem of computing the minimum weight triangulation, given a set of points in a plane and enough edges to form a triangulation, is NP-hard. These results are an advance towards a proof that the minimum weight triangulation problem is NP-hard
    corecore